Data Mining for Regulatory Elements in Yeast Genome
Abstract
We have examined methods and developed a general software tool for finding and analyzing combinations of transcription factor binding sites that occur relatively often in gene upstream regions (putative promoter regions) of the yeast genome. Such frequently occurring combinations may be essential parts of possible promoter classes. The regions upstream of all genes were first extracted from the yeast genome database MIPS, using the information in the database's annotation files. Those that do not overlap with coding regions were chosen for further study. Next, all occurrences of the yeast transcription factor binding sites, as given in the IMD database, were located in the genome, and in the selected regions in particular. Finally, by using general-purpose data mining software in combination with our own software, which parametrizes the search, we can find the combinations of binding sites that occur in the upstream regions more frequently than would be expected from the frequencies of the individual sites. The procedure also finds so-called association rules present in such combinations. The developed tool is available for use through the WWW.
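The final step of the abstract (finding site combinations that co-occur more often than the individual site frequencies predict, together with association rules) can be illustrated with a small sketch. This is not the authors' tool. It assumes that sites occur independently across regions when computing the expected count, it considers pairs only, and the function name `frequent_site_pairs`, the thresholds, and the toy region and site labels are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_site_pairs(regions, min_support=2, min_lift=1.0):
    """Find binding-site pairs that co-occur more often than expected.

    regions: dict mapping an upstream-region id to the set of binding-site
    names found in that region (hypothetical input format).
    Expected co-occurrence assumes the two sites are independent.
    """
    n = len(regions)
    site_counts = Counter()
    pair_counts = Counter()
    for sites in regions.values():
        site_counts.update(sites)
        # Sort so that each unordered pair has one canonical key.
        pair_counts.update(combinations(sorted(sites), 2))

    results = []
    for (a, b), observed in pair_counts.items():
        if observed < min_support:
            continue
        expected = site_counts[a] * site_counts[b] / n
        lift = observed / expected
        if lift > min_lift:
            results.append({
                "pair": (a, b),
                "observed": observed,
                "expected": expected,
                "lift": lift,
                # Association-rule confidences: P(b | a) and P(a | b).
                "conf_a_to_b": observed / site_counts[a],
                "conf_b_to_a": observed / site_counts[b],
            })
    results.sort(key=lambda r: r["lift"], reverse=True)
    return results


# Toy data with made-up region ids and site labels.
toy_regions = {
    "r1": {"A", "B"},
    "r2": {"A", "B", "C"},
    "r3": {"C"},
    "r4": {"A", "B"},
    "r5": {"C", "D"},
    "r6": {"D"},
}

for row in frequent_site_pairs(toy_regions):
    print(row["pair"], row["observed"], row["expected"], row["lift"])
```

In the toy data, sites A and B each occur in three of six regions and always together, so the observed count (3) is twice the expected count under independence (1.5), and the rule "A implies B" has confidence 1.0. A real analysis would also need larger combinations and a significance test, which this sketch omits.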
Similar resources
A Computational Approach to Reconstructing Gene Regulatory Networks
With the rapid accumulation of gene expression data in publicly accessible databases, computational study of gene regulation has become an obtainable goal. Intrinsic to this task will be data mining tools for inferring knowledge from biological data. In this project, we have developed a new data mining technique in which we adapt the connectivity of a recurrent neural network model by indexing ...
63. A Computational Approach to Discover Differential Cooperation of Regulatory Sites in Functionally Related Genes in Yeast Genome
The availability of genome-wide gene expression data provides a unique set of genes from which to decipher the mechanisms underlying the common transcriptional response. A set of transcription factors which bind to target sites regulate the gene transcription cooperatively. The functional-specific combinations are discovered by a statistical approach and the over-represented repetitive elements...
The Repetitive Sequence Database and Mining Putative Regulatory Elements in Gene Promoter Regions
At least 43% of the human genome is occupied by repetitive elements. Moreover, around 51% of the rice genome is occupied by repetitive elements. Analysis of repetitive elements suggests that they may have played a very important role in genome evolution. The first part of this study is to describe a database of repetitive elements - RSDB. The RSDB database contai...
Mining yeast transcriptional regulatory modules from factor DNA-binding sites and gene expression data.
In eukaryotes, gene expression is controlled by various transcription factors that bind to the promoter regions. Transcription factors may act positively, negatively or not at all. Different combinations of them may also activate or repress gene expression, and form regulatory networks of transcription. Uncovering such regulatory networks is a central challenge in genomic biology. In...
COMPASSS (COMplex PAttern of Sequence Search Software), a simple and effective tool for mining complex motifs in whole genomes
MOTIVATION: The complete sequencing of the human genome shows that only 1% of the entire genome encodes proteins. The major part of the genome is made up of non-coding DNA, regulatory elements and junk DNA. Transcriptional regulation plays a central role in a multitude of critical cellular processes and responses, and it is a central force in the development and differentiation of multicellu...
Journal title:
- Proceedings. International Conference on Intelligent Systems for Molecular Biology
Volume 5
Publication date: 1997